Complexity Measures for Map-Reduce, and Comparison to Parallel Computing

Authors

  • Ashish Goel
  • Kamesh Munagala
Abstract

The programming paradigm Map-Reduce [3] and its main open-source implementation, Hadoop [1], have had an enormous impact on large-scale data processing. Our goal in this expository writeup is twofold: first, we want to present some complexity measures that allow us to talk about Map-Reduce algorithms formally, and second, we want to point out why this model is actually different from other models of parallel programming, most notably the PRAM (Parallel Random Access Memory) model. We are looking for complexity measures that are detailed enough to make fine-grained distinctions between different algorithms, but which also abstract away many of the implementation details.

Similar resources

Optimization of Agricultural BMPs Using a Parallel Computing Based Multi-Objective Optimization Algorithm

Beneficial Management Practices (BMPs) are important measures for reducing agricultural non-point source (NPS) pollution. However, selection of BMPs for placement in a watershed requires optimizing available resources to maximize possible water quality benefits. Due to its iterative nature, the optimization typically takes a long time to achieve the BMP trade-off results, which is not desirable ...

Parallelizing Assignment Problem with DNA Strands

Background: Many problems of combinatorial optimization, which are solvable only in exponential time, are known to be Non-Deterministic Polynomial hard (NP-hard). With the advent of parallel machines, new opportunities have emerged to develop effective solutions for NP-hard problems. However, solving these problems in polynomial time needs massive parallel machines and ...

Low Complexity Converter for the Moduli Set {2^n+1,2^n-1,2^n} in Two-Part Residue Number System

The Residue Number System is a numeral system that represents a number by its remainders with respect to several different moduli. Converting a number to smaller ones and carrying out parallel calculations on these numbers increases the speed of arithmetic operations in this system. However, the main factor that affects the performance of the system is the hardware complexity of the reverse converter. Reverse co...

Parallelization of genetic algorithms using Hadoop Map/Reduce

In this paper we present a parallel implementation of a genetic algorithm using the map/reduce programming paradigm. The Hadoop implementation of the map/reduce library is used for this purpose. We compare our implementation with the implementation presented in [1]. These two implementations are compared in solving the One Max (Bit counting) problem. The comparison criteria between implementations are fitness converge...

Parallel computing using MPI and OpenMP on self-configured platform, UMZHPC.

Parallel computing is a topic of interest for a broad scientific community since it facilitates many time-consuming algorithms in different application domains. In this paper, we introduce a novel platform for parallel computing using the MPI and OpenMP programming interfaces, based on a set of networked PCs. UMZHPC is a free Linux-based parallel computing infrastructure that has been developed to cr...

Journal title:
  • CoRR

Volume abs/1211.6526

Publication date 2012